feat(scraper-create): write {collector_id, ...} envelope to -o #6
Open
anil-bd wants to merge 1 commit into
Today `-o create.json` writes only the final AI-progress payload:
no `collector_id`, no name, no view_url. The documented recipe in
references/recipes.md depends on jq reading the collector_id out
of that file:
    COLLECTOR_ID=$(jq -r '.collector_id // .id' create.json)
    bdata scraper run "$COLLECTOR_ID" ...
Today that returns the string "null" because the field doesn't exist
in the file. Every script that follows the docs to chain create →
run is silently broken.
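
To make the failure mode concrete, here is a minimal TypeScript sketch that mimics what `jq -r '.collector_id // .id'` prints for a parsed file. It is an illustration only, not code from this PR: `jq_collector_id` is a hypothetical helper, the fields in `old_payload` are invented stand-ins for the bare progress payload, and `c_example` is a placeholder id.

```typescript
// Mimics `jq -r '.collector_id // .id'` for scalar values.
// jq's `//` falls through to the right side when the left is null or false,
// and -r prints strings raw while other values print as JSON text.
function jq_collector_id(doc: Record<string, unknown>): string {
    const left = doc.collector_id ?? null;
    const right = doc.id ?? null;
    const result = left !== null && left !== false ? left : right;
    return typeof result === 'string' ? result : JSON.stringify(result);
}

// Hypothetical pre-change file: progress fields only, no id anywhere.
const old_payload: Record<string, unknown> = {progress: 100, steps: ['a', 'b']};
// Envelope-style file with a placeholder id.
const new_payload: Record<string, unknown> = {collector_id: 'c_example', status: 'done'};

console.log(jq_collector_id(old_payload)); // prints "null"
console.log(jq_collector_id(new_payload)); // prints "c_example"
```

The first result is the literal text "null", which the shell then passes to `bdata scraper run` as if it were an id.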
This change wraps every termination path (success, AI-trigger
failure, status=failed, polling exception) in one envelope:
    {
      "collector_id": "c_...",
      "name": "audit-r4-...",
      "status": "done" | "failed" | "ai_trigger_failed" | "poll_failed",
      "completed_steps": [...],
      "view_url": "https://brightdata.com/cp/scrapers/c_...",
      "created_at": "2026-05-18T07:28:30Z",
      "error": "..."   // failure paths only
    }
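
As a rough sketch of how that shape could be typed and assembled, assuming the field semantics described in this message: the names `CreateEnvelope`, `EnvelopeInput` and `sketch_create_envelope` are hypothetical and are not the PR's actual `build_create_envelope` signature. The element type of `completed_steps` is not specified here, so it is left as `unknown[]`, and the `view_url` pattern is inferred from the example URL above.

```typescript
type CreateStatus = 'done' | 'failed' | 'ai_trigger_failed' | 'poll_failed';

interface CreateEnvelope {
    collector_id: string;
    name: string;
    status: CreateStatus;
    completed_steps: unknown[];
    view_url: string;
    created_at?: string; // only when the API reported it
    error?: string;      // failure paths only
}

// Hypothetical input: whatever the command knows when it terminates.
interface EnvelopeInput {
    collector_id: string;
    name: string;
    status: CreateStatus;
    completed_steps?: unknown[];
    created?: string;
    error?: string;
}

function sketch_create_envelope(input: EnvelopeInput): CreateEnvelope {
    const envelope: CreateEnvelope = {
        collector_id: input.collector_id,
        name: input.name,
        status: input.status,
        completed_steps: input.completed_steps ?? [],
        // Assumed URL pattern, taken from the example in this message.
        view_url: `https://brightdata.com/cp/scrapers/${input.collector_id}`,
    };
    // Optional fields are omitted rather than filled with invented values.
    if (input.created !== undefined) envelope.created_at = input.created;
    if (input.error !== undefined) envelope.error = input.error;
    return envelope;
}

const ok = sketch_create_envelope({collector_id: 'c_example', name: 'demo', status: 'done'});
const bad = sketch_create_envelope({
    collector_id: 'c_example',
    name: 'demo',
    status: 'ai_trigger_failed',
    error: 'example error text',
});
console.log('created_at' in ok, bad.error);
```

Both calls return the same core fields; only `created_at` and `error` come and go.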
Notable design choices:
* Every termination path writes the same shape, including failure
  paths that previously wrote nothing. So a script using
  `jq -r '.collector_id'` always recovers an id when one exists,
  even from a stub collector that hit the AI-Flow parallel-job cap.
  This makes good on SKILL.md's promise that every failure path
  surfaces the collector_id.
* `view_url` is included on every envelope so the user has a
  one-click recovery path to inspect / finish / delete the scraper
  in the dashboard, without needing to know the URL pattern.
* `created_at` is taken from the template-creation response when
  the API provides it (`Create_template_response.created`) and
  omitted otherwise; it is never invented.
* New `--legacy-output` flag preserves today's bare-progress shape
  for one minor version so any existing scripts that depended on
  the old shape have a migration window. Slated for removal in
  the next major.
* Stdout (the success summary printed to TTY) is unchanged. Only
  the machine-readable `-o` / `--json` / `--pretty` payload is
  reshaped.
* Scoped strictly to `src/commands/scraper.ts` and the new
  envelope type. The shared HTTP client and other commands
  (scrape, search, discover, pipelines, browser) are untouched.
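
From the consumer side, the uniform shape means a script can branch on `status` while always holding an id. The sketch below shows one way a caller might do that; `plan_next_step` and the `NextStep` type are hypothetical names for illustration, and the sample file contents use placeholder values.

```typescript
type NextStep =
    | {kind: 'run'; collector_id: string}
    | {kind: 'recover'; collector_id: string; status: string; view_url?: string}
    | {kind: 'no_id'};

// Decide what to do after `create`, given the raw text of the -o file.
function plan_next_step(raw_json: string): NextStep {
    const doc = JSON.parse(raw_json) as Record<string, unknown>;
    const id = doc.collector_id;
    // Old-shape files (or --legacy-output) carry no collector_id.
    if (typeof id !== 'string' || id === '') return {kind: 'no_id'};
    if (doc.status === 'done') return {kind: 'run', collector_id: id};
    // Any other status: the id is still known, so the stub can be inspected.
    return {
        kind: 'recover',
        collector_id: id,
        status: String(doc.status),
        view_url: typeof doc.view_url === 'string' ? doc.view_url : undefined,
    };
}

const done_file = JSON.stringify({collector_id: 'c_example', status: 'done'});
const failed_file = JSON.stringify({
    collector_id: 'c_example',
    status: 'ai_trigger_failed',
    view_url: 'https://brightdata.com/cp/scrapers/c_example',
    error: 'example error text',
});

console.log(plan_next_step(done_file).kind);   // prints "run"
console.log(plan_next_step(failed_file).kind); // prints "recover"
```

A file written with `--legacy-output` would land in the `no_id` branch, which is the behaviour existing scripts see today.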
Tests: 4 new `build_create_envelope` unit cases covering success,
omitted-created_at, failure-with-error, and view_url-on-every-path.
5 new `handle_create_scraper` integration cases covering
success envelope, the documented jq recipe, --legacy-output
preserving the bare shape, AI-trigger failure envelope (the
stub-collector recovery path), poll-status-failed envelope, and
poll-exception envelope. Two existing tests updated from strict
opts-object matches to objectContaining-style (the contract is
now the envelope shape, not the bare payload).
55 / 55 scraper tests pass. The 9 pre-existing failures in
unrelated suites (daemon, add-mcp, browser, discover, scrape) on
main are unchanged by this PR.
Spec: brightdata/skills repo, proposal at
skills/scraper-studio/proposals/PR-2-create-envelope.md (to be
filed alongside this PR).